{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f71d19de",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/vector_stores/SimpleIndexDemoLlama2.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9c48213d-6e6a-4c10-838a-2a7c710c3a05",
   "metadata": {},
   "source": [
    "# Llama2 + VectorStoreIndex\n",
    "\n",
    "This notebook walks through the proper setup to use llama-2 with LlamaIndex. Specifically, we look at using a vector store index."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "91f09a23",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b67d9bd5",
   "metadata": {},
   "source": [
    "If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe23f913",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-llms-replicate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24fbf539",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "ba765302",
   "metadata": {},
   "source": [
    "### Keys"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3d8cab38",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\"\n",
    "os.environ[\"REPLICATE_API_TOKEN\"] = \"YOUR_REPLICATE_TOKEN\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "50d3b817-b70e-4667-be4f-d3a0fe4bd119",
   "metadata": {},
   "source": [
    "### Load documents, build the VectorStoreIndex"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "690a6918-7c75-4f95-9ccc-d2c4a1fe00d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional logging\n",
    "# import logging\n",
    "# import sys\n",
    "\n",
    "# logging.basicConfig(stream=sys.stdout, level=logging.INFO)\n",
    "# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))\n",
    "\n",
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
    "\n",
    "from IPython.display import Markdown, display"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be92665d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.replicate import Replicate\n",
    "from llama_index.core.llms.llama_utils import (\n",
    "    messages_to_prompt,\n",
    "    completion_to_prompt,\n",
    ")\n",
    "\n",
    "# The replicate endpoint\n",
    "LLAMA_13B_V2_CHAT = \"a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5\"\n",
    "\n",
    "\n",
    "# inject custom system prompt into llama-2\n",
    "def custom_completion_to_prompt(completion: str) -> str:\n",
    "    return completion_to_prompt(\n",
    "        completion,\n",
    "        system_prompt=(\n",
    "            \"You are a Q&A assistant. Your goal is to answer questions as \"\n",
    "            \"accurately as possible is the instructions and context provided.\"\n",
    "        ),\n",
    "    )\n",
    "\n",
    "\n",
    "llm = Replicate(\n",
    "    model=LLAMA_13B_V2_CHAT,\n",
    "    temperature=0.01,\n",
    "    # override max tokens since it's interpreted\n",
    "    # as context window instead of max tokens\n",
    "    context_window=4096,\n",
    "    # override completion representation for llama 2\n",
    "    completion_to_prompt=custom_completion_to_prompt,\n",
    "    # if using llama 2 for data agents, also override the message representation\n",
    "    messages_to_prompt=messages_to_prompt,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13799473",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Settings\n",
    "\n",
    "Settings.llm = llm"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a1555336",
   "metadata": {},
   "source": [
    "Download Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03d1691e-544b-454f-825b-5ee12f7faa8a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load documents\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad144ee7-96da-4dd6-be00-fd6cf0c78e58",
   "metadata": {},
   "outputs": [],
   "source": [
    "index = VectorStoreIndex.from_documents(documents)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b6caf93b-6345-4c65-a346-a95b0f1746c4",
   "metadata": {},
   "source": [
    "## Querying"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85466fdf-93f3-4cb1-a5f9-0056a8245a6f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# set Logging to DEBUG for more detailed outputs\n",
    "query_engine = index.as_query_engine()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bdda1b2c-ae46-47cf-91d7-3153e8d0473b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<b> Based on the context information provided, the author's activities growing up were:\n",
       "1. Writing short stories, which were \"awful\" and had \"hardly any plot.\"\n",
       "2. Programming on an IBM 1401 computer in 9th grade, using an early version of Fortran language.\n",
       "3. Building simple games, a program to predict the height of model rockets, and a word processor for his father.\n",
       "4. Reading science fiction novels, such as \"The Moon is a Harsh Mistress\" by Heinlein, which inspired him to work on AI.\n",
       "5. Living in Florence, Italy, and walking through the city's streets to the Accademia.\n",
       "\n",
       "Please note that these activities are mentioned in the text and are not based on prior knowledge or assumptions.</b>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "response = query_engine.query(\"What did the author do growing up?\")\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "24935a47",
   "metadata": {},
   "source": [
    "### Streaming Support"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "446406f9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " Based on the context information provided, it appears that the author worked at Interleaf, a company that made software for creating and managing documents. The author mentions that Interleaf was \"on the way down\" and that the company's Release Engineering group was large compared to the group that actually wrote the software. It is inferred that Interleaf was experiencing financial difficulties and that the author was nervous about money. However, there is no explicit mention of what specifically happened at Interleaf."
     ]
    }
   ],
   "source": [
    "query_engine = index.as_query_engine(streaming=True)\n",
    "response = query_engine.query(\"What happened at interleaf?\")\n",
    "for token in response.response_gen:\n",
    "    print(token, end=\"\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
